
    Paying attention to cardiac surgical risk: An interpretable machine learning approach using an uncertainty-aware attentive neural network

    Machine learning (ML) is increasingly applied to predict adverse postoperative outcomes in cardiac surgery. Commonly used ML models fail to translate to clinical practice due to absent model explainability, limited uncertainty quantification, and inflexibility to missing data. We aimed to develop and benchmark a novel ML approach, the uncertainty-aware attention network (UAN), to overcome these common limitations. Two Bayesian uncertainty quantification methods were tested: generalized variational inference (GVI) and a posterior network (PN). The UAN models were compared with an ensemble of XGBoost models and a Bayesian logistic regression model (LR) with imputation. The derivation dataset consisted of 153,932 surgery events from the Australian and New Zealand Society of Cardiac and Thoracic Surgeons (ANZSCTS) Cardiac Surgery Database. The external validation dataset consisted of 7,343 surgery events extracted from the Medical Information Mart for Intensive Care (MIMIC) III critical care dataset. The highest performing model on the external validation dataset was UAN-GVI, with an area under the receiver operating characteristic curve (AUC) of 0.78 (0.01). Model performance improved on high-confidence samples, with an AUC of 0.81 (0.01). Confidence calibration for aleatoric uncertainty was excellent for all models. Calibration for epistemic uncertainty was more variable, with the ensemble of XGBoost models performing best at an AUC of 0.84 (0.08). Epistemic uncertainty was improved using the PN approach compared to GVI. The UAN uses an interpretable and flexible deep learning approach to provide estimates of model uncertainty alongside state-of-the-art predictions. The model has been made freely available as an easy-to-use web application, demonstrating that, by designing uncertainty-aware models with innately explainable predictions, deep learning may become more suitable for routine clinical use. The ANZSCTS Cardiac Surgery Database Program is funded by the Department of Health (Victoria), the Clinical Excellence Commission (NSW), and Queensland Health (QLD). ANZSCTS Database Research activities are supported through a National Health and Medical Research Council Principal Research Fellowship (APP 1136372) and Program Grant (APP 1092642).
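
    A minimal sketch of the confidence-filtered evaluation described above (scoring only the most confident samples), assuming a model that returns per-sample predictions together with an uncertainty score; the function name and synthetic data are illustrative, not the authors' published code.

        # Confidence-filtered AUC: evaluate only the lowest-uncertainty samples.
        import numpy as np
        from sklearn.metrics import roc_auc_score

        def confidence_filtered_auc(y_true, y_prob, uncertainty, keep_fraction=0.8):
            """AUC restricted to the keep_fraction most confident samples."""
            cutoff = np.quantile(uncertainty, keep_fraction)
            mask = uncertainty <= cutoff
            return roc_auc_score(y_true[mask], y_prob[mask])

        # Synthetic example: predictions plus an uncertainty score per sample.
        rng = np.random.default_rng(0)
        y_true = rng.integers(0, 2, size=1000)
        y_prob = np.clip(y_true * 0.6 + rng.normal(0.2, 0.3, size=1000), 0.0, 1.0)
        uncertainty = rng.exponential(size=1000)  # e.g. predictive entropy

        print("AUC, all samples:     ", roc_auc_score(y_true, y_prob))
        print("AUC, high confidence: ", confidence_filtered_auc(y_true, y_prob, uncertainty))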

    Tree-based survival analysis improves mortality prediction in cardiac surgery

    Objectives: Machine learning (ML) classification tools are known to accurately predict many cardiac surgical outcomes. A novel approach, ML-based survival analysis, remains unstudied for predicting mortality after cardiac surgery. We aimed to benchmark the performance, as measured by the concordance index (C-index), of tree-based survival models against Cox proportional hazards (CPH) modeling, and to explore risk factors using the best-performing model. Methods: 144,536 patients with 147,301 surgery events from the Australian and New Zealand Society of Cardiac and Thoracic Surgeons (ANZSCTS) national database were used to train and validate the models. Univariate analysis was performed using Student's t-test for continuous variables, the chi-squared test for categorical variables, and stratified Kaplan-Meier estimation of the survival function. Three ML models were tested: a decision tree (DT), a random forest (RF), and a gradient boosting machine (GBM). Hyperparameter tuning was performed using a Bayesian search strategy. Performance was assessed using 2-fold cross-validation repeated 5 times. Results: The highest performing model was the GBM with a C-index of 0.803 (0.002), followed by the RF with 0.791 (0.003), the DT with 0.729 (0.014), and finally CPH with 0.596 (0.042). The 5 most predictive features were age, type of procedure, length of hospital stay, drain output in the first 4 h (ml), and inotrope use for more than 4 h postoperatively. Conclusion: Tree-based learning for survival analysis is a non-parametric and performant alternative to CPH modeling. GBMs offer interpretable modeling of non-linear relationships, promising to expose the most relevant risk factors and uncover new questions to guide future research. The ANZSCTS National Cardiac Surgery Database Program is funded by the Department of Health (Victoria), the Clinical Excellence Commission (NSW), Queensland Health (QLD), and the cardiac surgical units participating in the registry. ANZSCTS Database Research activities are supported through a National Health and Medical Research Council Principal Research Fellowship (APP 1136372) and Program Grant (APP 1092642).
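
    A minimal sketch of tree-based survival analysis with a gradient boosting machine, scored by the C-index. The scikit-survival library and the synthetic data are my choices for illustration; the paper does not specify this implementation.

        import numpy as np
        from sksurv.ensemble import GradientBoostingSurvivalAnalysis
        from sksurv.metrics import concordance_index_censored

        rng = np.random.default_rng(42)
        X = rng.normal(size=(500, 5))                  # stand-ins for age, drain output, ...
        time = rng.exponential(scale=np.exp(X[:, 0]))  # survival time depends on feature 0
        event = rng.random(500) < 0.7                  # True = death observed (not censored)

        # scikit-survival expects a structured array of (event indicator, time)
        y = np.array(list(zip(event, time)), dtype=[("event", bool), ("time", float)])

        gbm = GradientBoostingSurvivalAnalysis(n_estimators=200, learning_rate=0.05)
        gbm.fit(X, y)

        risk = gbm.predict(X)  # higher score = higher predicted risk
        cindex = concordance_index_censored(event, time, risk)[0]
        print(f"C-index: {cindex:.3f}")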

    Efficient Cross-Validation of Echo State Networks

    Echo State Networks (ESNs) are known for their fast and precise one-shot learning of time series, but they often need careful hyper-parameter tuning for best performance, and for this, good validation is key. Usually, however, a single validation split is used. In this rather practical contribution we suggest several schemes for cross-validating ESNs and introduce an efficient algorithm for implementing them. In our proposed method of doing k-fold cross-validation, the component that dominates the time complexity of the already quite fast ESN training remains constant (does not scale up with k). The component that does scale linearly with k starts dominating only in some uncommon situations. Thus, in many situations, k-fold cross-validation of ESNs can be done with virtually the same time complexity as a simple single-split validation. Space complexity can also remain the same. We also discuss when the proposed validation schemes for ESNs could be beneficial, and empirically investigate them on several different real-world datasets. Comment: Accepted at the ICANN'19 Workshop on Reservoir Computing.
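
    A sketch of the core idea, assuming the usual ridge-regression ESN readout: harvest reservoir states once (the dominant cost), accumulate the normal-equation matrices per fold, and train each fold's readout by subtracting that fold's contribution from the totals. This illustrates the validation scheme only; reservoir simulation, washout, and state-continuity details are omitted.

        import numpy as np

        def kfold_readout_cv(X, Y, k=5, beta=1e-6):
            """X: (n_states, T) collected reservoir states; Y: (n_out, T) targets."""
            n, T = X.shape
            folds = np.array_split(np.arange(T), k)
            # Per-fold contributions to X X^T and Y X^T (computed once)
            XX = [X[:, f] @ X[:, f].T for f in folds]
            YX = [Y[:, f] @ X[:, f].T for f in folds]
            XX_tot, YX_tot = sum(XX), sum(YX)
            errors = []
            for i, f in enumerate(folds):
                # Train on all folds except i by subtracting fold i's matrices
                A = XX_tot - XX[i] + beta * np.eye(n)   # symmetric
                W = np.linalg.solve(A, (YX_tot - YX[i]).T).T
                errors.append(np.mean((W @ X[:, f] - Y[:, f]) ** 2))
            return float(np.mean(errors))

    The extra cost per fold is one n-by-n linear solve, so the state-harvesting pass that dominates ESN training is paid only once, independent of k.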

    After the epidemic: Zika virus projections for Latin America and the Caribbean

    Background: Zika is one of the most challenging emergent vector-borne diseases, yet its future public health impact remains unclear. Zika was of little public health concern until recent reports of its association with congenital syndromes. By 3 August 2017, ~217,000 Zika cases and ~3,400 cases of associated congenital syndrome had been reported in Latin America and the Caribbean. Some modelling exercises suggest that Zika virus infection could become endemic, in agreement with recent declarations from the World Health Organisation. Methodology/Principal findings: We produced high-resolution, spatially-explicit projections of Zika cases, associated congenital syndromes, and monetary costs for Latin America and the Caribbean now that the epidemic phase of the disease appears to be over. In contrast to previous studies, which have adopted a modelling approach to map Zika potential, we project case numbers using a statistical approach based upon reported dengue case data as a Zika surrogate. Our results indicate that ~12.3 (0.7–162.3) million Zika cases could be expected across Latin America and the Caribbean every year, leading to ~64.4 (0.2–5,159.3) thousand cases of Guillain-Barré syndrome and ~4.7 (0.0–116.3) thousand cases of microcephaly. The economic burden of these neurological sequelae is estimated to be ~USD 2.3 (USD 0–159.3) billion per annum. Conclusions/Significance: Zika is likely to have significant public health consequences across Latin America and the Caribbean in years to come. Our projections inform regional and federal health authorities, offering an opportunity to adapt to this public health challenge.
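
    A toy numeric sketch of the surrogate logic described above: scale reported dengue cases by an uncertain Zika-to-dengue ratio, then apply per-case sequela risks. Every number below is an illustrative placeholder, not a fitted value from the paper.

        import numpy as np

        rng = np.random.default_rng(1)
        dengue_cases = 2.5e6                          # hypothetical annual dengue cases
        zika_ratio = rng.lognormal(0.0, 0.5, 10_000)  # assumed uncertain surrogate ratio

        zika = dengue_cases * zika_ratio              # projected Zika cases
        gbs = zika * 2.4e-4                           # assumed Guillain-Barre risk per case
        microcephaly = zika * 1.0e-4                  # assumed microcephaly risk per case

        for name, x in [("Zika", zika), ("GBS", gbs), ("Microcephaly", microcephaly)]:
            lo, med, hi = np.percentile(x, [2.5, 50, 97.5])
            print(f"{name}: median {med:,.0f} (95% interval {lo:,.0f}-{hi:,.0f})")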

    Forecasting: theory and practice

    Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a variety of real-life contexts. We do not claim that this review is an exhaustive list of methods and applications. However, we hope that our encyclopedic presentation will offer a point of reference for the rich work that has been undertaken over the last decades, with some key insights for the future of forecasting theory and practice. Given its encyclopedic nature, the intended mode of reading is non-linear, and we offer cross-references to allow readers to navigate through the various topics. We complement the theoretical concepts and applications covered with large lists of free or open-source software implementations and publicly available databases.

    How to evaluate sentiment classifiers for Twitter time-ordered data?

    Social media are becoming an increasingly important source of information about the public mood regarding issues such as elections, Brexit, and the stock market. In this paper we focus on sentiment classification of Twitter data. Constructing sentiment classifiers is a standard text mining task, but here we address the question of how to properly evaluate them, as there is no settled way to do so. Sentiment classes are ordered and unbalanced, and Twitter produces a stream of time-ordered data. The problem we address concerns the procedures used to obtain reliable estimates of performance measures, and whether the temporal ordering of the training and test data matters. We collected a large set of 1.5 million tweets in 13 European languages. We created 138 sentiment models and out-of-sample datasets, which are used as a gold standard for evaluations. The corresponding 138 in-sample datasets are used to empirically compare six different estimation procedures: three variants of cross-validation and three variants of sequential validation (where the test set always follows the training set). We find no significant difference between the best cross-validation and sequential validation. However, we observe that all cross-validation variants tend to overestimate performance, while the sequential methods tend to underestimate it. Standard cross-validation with random selection of examples is significantly worse than blocked cross-validation, and should not be used to evaluate classifiers on time-ordered data.
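
    A minimal sketch contrasting the three families of estimation procedures compared here: standard cross-validation with shuffling, blocked cross-validation on contiguous chunks, and sequential validation where the test set always follows the training data. The classifier and features are placeholders for a real sentiment pipeline.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import KFold, TimeSeriesSplit, cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(2000, 20))  # stand-in for tweet features, in time order
        y = (X[:, 0] + rng.normal(size=2000) > 0).astype(int)

        clf = LogisticRegression(max_iter=1000)
        schemes = [
            ("shuffled CV", KFold(n_splits=5, shuffle=True, random_state=0)),  # ignores time order
            ("blocked CV", KFold(n_splits=5, shuffle=False)),                  # contiguous blocks
            ("sequential", TimeSeriesSplit(n_splits=5)),                       # test follows train
        ]
        for name, cv in schemes:
            scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
            print(f"{name:11s} mean accuracy: {scores.mean():.3f}")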

    Machine Learning Algorithms for Predicting and Risk Profiling of Cardiac Surgery-Associated Acute Kidney Injury

    Using a large national database of cardiac surgical procedures, we applied machine learning (ML) to risk stratification and profiling for cardiac surgery-associated acute kidney injury, and compared the performance of ML to established scoring tools. Four ML algorithms were used: logistic regression (LR), a gradient boosted machine (GBM), K-nearest neighbor, and neural networks (NN). These were compared to the Cleveland Clinic score and a risk score developed on the same database. Five-fold cross-validation repeated 20 times was used to measure the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. Risk profiles from the GBM and NN were generated using Shapley additive values. A total of 97,964 surgery events in 96,653 patients were included. For predicting postoperative renal replacement therapy using pre- and intraoperative data, LR, GBM, and NN achieved an AUC (standard deviation) of 0.84 (0.01), 0.85 (0.01), and 0.84 (0.01) respectively, outperforming the highest performing scoring tool with 0.81 (0.004). For predicting cardiac surgery-associated acute kidney injury, LR, GBM, and NN achieved 0.77 (0.01), 0.78 (0.01), and 0.77 (0.01) respectively, outperforming the scoring tool with 0.75 (0.004). Compared to the scores and LR, Shapley additive values analysis of black-box model predictions was able to generate patient-level explanations for each prediction. ML algorithms provide state-of-the-art approaches to risk stratification, and explanatory modeling can exploit complex decision boundaries to aid the clinician in understanding the risks specific to individual patients.
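
    A minimal sketch of patient-level explanation with Shapley additive values for a gradient boosted model. The xgboost and shap libraries and the synthetic data are my choices for illustration; the study's own feature set and models are not reproduced here.

        import numpy as np
        import shap
        import xgboost

        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 6))  # stand-ins for perioperative features
        y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 1).astype(int)

        model = xgboost.XGBClassifier(n_estimators=100, max_depth=3).fit(X, y)

        explainer = shap.TreeExplainer(model)
        shap_values = explainer.shap_values(X)  # one additive contribution per feature

        # Patient-level explanation: each feature's additive push on this
        # patient's predicted risk (on the model's margin scale).
        patient = 0
        for j, contribution in enumerate(shap_values[patient]):
            print(f"feature_{j}: {contribution:+.3f}")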